ABSTRACT

Four presentations from the 1985 Annual Michigan 
School Testing Conference on "Assessing Higher Order Skills" are 
offered in this paper, and the chairman of the First General Session 
provides an introductory section. The papers individually and 
collectively address the problem of defining higher order thinking 
skills. A second major question facing those interested in teaching 
and testing thinking skills involves whether such skills should be 
taught and tested as a separate subject area or embedded and infused 
in existing subject matter. The paper by Michael H. Kean offers a
concise treatment of the major questions facing those who would 
embark on the teaching and testing of higher order thinking skills. 
Edward D. Roeber and Betty L. Stevens describe the activities in 
Michigan during the planning and development stage for testing higher 
order skills, and outline the alternative approaches being considered 
by state level decision makers. Joan Boykoff Baron's paper provides 
an analysis of Connecticut's experiences in implementing a higher 
order thinking skills component in an ongoing assessment program. 
John Fremer and Mark Daniel provide a recapitulation of problems and 
prospects through a discussion of several recent developments in the 
assessment of higher order thinking skills. (LMO) 
WHAT'S GOING ON IN THE ASSESSMENT OF HIGHER ORDER SKILLS?

C. Philip Kearney 
The University of Michigan 

A simple and straightforward answer is, "A great deal!" As
Fremer and Daniel, the authors of the fourth paper in this
collection, point out, concern with the teaching and testing of
higher order thinking skills is fast taking on the
characteristics of a major educational reform movement. Several
states are developing and implementing assessment programs aimed
at higher order thinking skills, textbook publishers and testing
companies are becoming increasingly active in this arena, and
conferences centered on this topic are springing up across the
Nation. One of these conferences was the 1985 Annual Michigan
School Testing Conference, which took as its theme "Assessing
Higher Order Skills." The First General Session of the
Conference was built around four presentations which addressed
the title question of this piece, "What's Going on in the
Assessment of Higher Order Skills?" ERIC/TME felt that these
four presentations merited a wider audience and, consequently,
asked the four presenters to prepare papers for inclusion in this
present ERIC/TME publication.* Because I chaired the session, I
was asked to prepare this brief introductory piece.

The four papers, properly, do not attempt to provide
definitive answers on what constitutes so-called higher order
thinking skills, on how they should be taught, or on how they
should be assessed. But the papers do offer a base of
information from which the reader can begin to form her or his
own tentative answers to these questions.

The papers individually and collectively address, but do not 
resolve, the first major problem facing those interested and 
involved in this arena, namely, the problem of defining higher 
order thinking skills. As each of the authors implicitly or 
explicitly demonstrates, there is no firm consensus on what
should be included or excluded under the higher order thinking
skills rubric. For the parent, as Kean suggests, the answer is
easy: "What I want is for you to teach my child to think." For
the professional, the answer is much more complex. It includes
such notions as a habit of reflective thinking; a disposition
or willingness to think critically, assertively, and habitually;
more difficult subject matter content; critical reasoning
skills; skills that go beyond straight recall or learning of 
facts; and a literal laundry list of other cognitive activities. 

Neither do the papers offer a definitive resolution of a 
second major question facing those interested in teaching and 
testing higher order thinking skills, namely, whether they 
should be taught and tested as a separate subject area or
embedded and infused in existing subject matter and tested in
like fashion. While the papers appear to have a bias toward the
embedded and infused approach, it appears still to be a question
lacking a clear-cut answer. Kean strongly advocates embedding
thinking skills in every subject. Baron tells us that
Connecticut has embraced both approaches in its assessment
efforts. Roeber tells us that Michigan has yet to resolve the
question completely. Fremer and Daniel point out that there are 
still clear differences of opinion on the question in the 
instructional and measurement communities. 

The reader also will become aware of a number of other
questions facing those who would develop programs to assess
higher order thinking skills, including whether to use a
"one-tiered" or "two-tiered" approach in fashioning the program,
whether the benefits of using multiple approaches to measuring
these skills outweigh the costs, whether every-pupil testing or
matrix sampling is called for, whether there is a need for
considerable test development work, or whether a number of
instruments that could be used in measuring these skills is 
already available. 

The paper by Roeber nicely summarizes the basic differences 
in the answers being provided to these questions by those in 
Michigan who advocate the teaching and testing of higher order
thinking skills. Roeber's paper also offers a picture of what
is going on in a State which, while it has had a state
assessment program for a number of years, is only now setting
out in a systematic way to include higher order thinking skills
in its assessment program. Baron's paper capitalizes on
Connecticut's experiences with the inclusion of higher order
skills in a state assessment program and shares with the reader
the lessons learned from those experiences. Taken together, the
two papers offer succinct descriptions of what's going on in the
assessment of higher order skills in two states.

Kean's paper and the paper by Fremer and Daniel provide the 
reader insights gained from persons vitally interested in the 
teaching and testing of higher order thinking skills because of 
their current roles with major test publishing firms, as well as
their ongoing roles as members of the professional measurement
community. Their experiences in working with instructional and
measurement practitioners charged with developing large-scale
assessment programs lend a practical flavor to their views on
this important topic. 

We suggest that the reader read the four pieces in the order 
that they are presented. Kean's paper, in our view, offers a
concise treatment of the major questions facing those who would
embark on the teaching and testing of higher order thinking
skills. Roeber's paper describes the activities of a State
still in the planning and development stage, and the nature of
the alternative approaches being considered by state level
decision makers. Baron's paper gives the reader the benefit of
Connecticut's experiences in implementing a higher order
thinking skills component in its ongoing assessment program and
the lessons that were learned from those experiences. Fremer
and Daniel's paper, while not necessarily written for that
purpose, provides a good recapitulation of problems and 
prospects through its discussion of several recent developments 
in the assessment of higher order thinking skills. 



As stated above, the reader is not offered definitive answers 
to the questions of what constitutes higher order thinking
skills, of how they should be taught, or of how they should be
tested. The careful reader, however, is offered a solid base of
information from which she or he can draw some tentative--and we
would stress tentative--answers to these questions.



*Thomas H. Fisher, Director, Student Assessment Program,
Florida State Department of Education, was one of the four 
presenters at the 1985 Annual Michigan School Testing 
Conference. Unfortunately, because of other demands, he was not 
able to prepare a paper on his presentation for inclusion in the 
present collection. Edward Roeber, who is immediately 
responsible for Michigan's efforts in this area, graciously 
consented to fill in for Dr. Fisher and prepared a paper 
describing Michigan's current activities in the assessment of 
higher order thinking skills. 
ASSESSING HIGHER ORDER THINKING SKILLS:
AN OVERVIEW OF THE ISSUES 

Michael H. Kean 
CTB/McGraw-Hill



In discussing higher order thinking skills, I plan to address:

1. What they are

2. How they might be taught and measured, and

3. Whether all the attention being paid to them will
result in changes of substance in education or result 
in just another passing fad. 

Before I do all that, though, I should like to reference several
comments that I think are pertinent to the issues at hand. The
first is from Bill Honig, the California State Superintendent of
Public Instruction, and has to do with how we arrived at what now
seems to be a crisis in the teaching of higher order skills.
Dr. Honig says:

In the '60s and '70s, we told kids, "you make up your mind
as to what's relevant and fun and study that." That was
an abdication of our role as educators. Then, when people
didn't think kids were learning anything, we went back to
basics. The public never misinterpreted what back to
basics meant--history, literature, science, writing, high
expectations, homework, order in the classroom--but
educators did. What educators did was narrow the
curriculum down to basic skills.

And what was the result of that narrowing? Ray Cortines, the
Superintendent of the San Jose (California) Unified School
District, characterized it rather nicely when he stated: "With
the return to basics, we screwed off the kids' heads, poured in
the information, and asked them to regurgitate the information by
asking questions at the end of the week. But we didn't teach them
how to use that information."
Public expectations are difficult to gauge. When we taught
students what they said they wanted to learn, the public wasn't
happy. When we taught the students what we thought the public
said we should be teaching, it turned out not to be sufficient.
Now we are being asked to teach something called "higher order
skills."



1. What are higher order skills? 

Following is a brief list of some of the skills and attributes 
that various authorities have identified as constituting higher 
order thinking skills: 

o Comparing and contrasting 

o Making inferences 

o Analyzing events 

o Synthesizing information 

o Drawing conclusions 

o Identifying the problem 

o Analyzing the problem 

o Suggesting possible solutions to the problem 

o Testing consequences of possible solutions 

o Assessing the reliability, relevance, sufficiency,
validity, and meaning of data 

o Analyzing arguments 

o Judging credibility of sources 

o Observing and judging observations and reports

o Induction 
o Deduction

o Assumption identification

o Prediction 

o Identification of fallacies 

o Definition of problem 

o Distinguishing between differences of kind and differences
of degree 

o Understanding verbal analogies 

o Selection of a solution process 

o Selection of a way of representing a solution

o Selection of a problem-solving strategy 

o Allocation of processing time

o Sensitivity to feedback 

o Translation of feedback into an action plan 

o Implementation of an action plan

o Testing hypotheses 

o Linear reasoning 

o Data gathering 

o Decision making 

o Classifying 

o Organizing 

o Identifying alternative points of view

o Recalling 

o Grouping/labeling 

o Classifying/categorizing 

o Ordering 

o Patterning 

o Prioritizing 
The list is even longer, but I do not think the point needs to be
belabored: there is a certain lack of consensus among educators
as to what higher order skills are.

Probably the average lay parent would have less trouble
defining what she or he thinks should go on in public schools.
"What I want," a parent might say, "is for you to teach my child
to think."

What that average parent might not say, but what they would
almost surely also want, is for the child to be taught to think
critically, assertively, and habitually; that is, to be a thinking
being, not just a pliant subject capable of displaying certain
behaviors on cue in an academic environment.

Harvey Siegel, Professor of History and Philosophy of
Education at the University of Nebraska, had some interesting
things to say about critical thinking in the November 1980 issue
of The Educational Forum:

"...it is not enough for a student to be able to evaluate
claims on the basis of evidence... In order to be a critical
thinker, a student must be disposed to do so. A critical
thinker must have a willingness to conform judgment to
principle, not simply an ability to so conform."

In the same article, Dr. Siegel says that students have a "right
to question, to challenge, and to demand reasons and
justifications for what is being taught." Those two quotations
have some interesting implications.

The first suggests that the apparent failure of our schools to
produce thinking beings may have at least as much to do with the
general environment they provide, as with the specific curricula
they teach; for surely "disposition" and "willingness" are not
explicitly taught commodities. I'll return to that point shortly,
but would first like to examine some of the implications of that
second quotation, the one about the students' right to question,
to challenge, and to demand.

I want to suggest that there may be less of a constituency out
there in the world at large, and even within the educational
community, for rational, thinking beings, than we as educators
might like to believe. If we succeed in teaching students to
think, we cannot expect that they will limit their thinking to
prescribed subject matter. We must expect, rather, that they will
question us and challenge us and demand of us that we justify our
positions on any number of issues from curriculum content to dress
codes. Thinking students can, in short, be very inconvenient
students.

If we as educators, whose business it is to train young minds,
are not entirely sure we want to deal with rational beings, how
much more likely is it that very considerable segments of society
at large may in fact be angry rather than grateful if we should
ever succeed in graduating a generation of truly rational
students? I do not mean to be overly negative; nor do I mean to
suggest that efforts to improve students' thinking skills are
either undesirable or impossible. Quite the contrary--if there is
in fact some degree of anti-rational bias both in our education
system and in society at large, it is all the more incumbent upon
us to find ways to teach students to overcome that bias. It is
important, though, that we be honest with ourselves about what we
are trying to accomplish. If we delude ourselves that we can
teach higher order thinking skills as just another chunk of
curriculum, to be drilled like the multiplication tables, our
efforts will fail.

2. How then should thinking skills be taught?

At this point, I'd like to compare and contrast two subject
areas that have been getting quite a bit of press lately: higher
order skills and computer literacy. At the moment, both computer
literacy and higher order skills curricula are rather trendy
subjects. Both have many buzz words associated with them, and
both have a certain air of newness.

Of the two, though, only computer literacy is genuinely new.
No one has, to my knowledge, suggested that the public schools did
a better job of teaching computer literacy a decade or a
generation ago than they do now. The public fear, in connection
with computer literacy, is that the schools may be failing to keep
pace with brand new developments, not that they are becoming
deficient at something that they used to do well.

Higher order thinking skills are an entirely different matter,
however. It is suggested that a decade or a generation ago
schools did a better job of teaching them than they do now. And
yet, I do not think that any large number of public schools ever
explicitly taught thinking skills until quite recently. I don't
think it occurred to very many people that thinking skills needed
to be taught. 

It is a rare person indeed who can progress very far in 
computer literacy without at least some formal instruction. We 
see in the microcomputer a device with definite characteristics 
that must be explicitly learned. We do not blame ourselves if, in
the absence of instruction, we are unable to make much use of
computers. 

Thinking, on the other hand, is something that most people do
remarkably well without any formal instruction. That is not to
say, of course, that we could not all improve our thinking skills
with formal instruction; but it is to suggest that, in the case of
thinking skills, we are dealing with something quite different
from other subjects in our curricula. 

If the schools of the past did not explicitly teach thinking 
skills, yet managed to turn out reasonably good thinkers, what did 
they do that the schools are not now doing? For one thing, they 
simply existed at a time when reason was held in higher repute 
than it now is. I cannot prove that, but it's worth considering. 
They also required a lot of writing, and writing is notorious as 
an instrument of thought. 

However, schools of the present cannot, in a direct and 
immediate way, control the spirit of the times in which they must 
function. Writing, for all its utility as a tool of thought, 
cannot be expected by itself to overcome students' deficiencies in 
thinking skills. So we are left with the proposition that 
something must be done to teach thinking skills in the public
schools. 

There are two fairly obvious ways to go about it. You can
introduce into the curriculum a new subject with a new and trendy
name, or you can embed the teaching of thinking skills throughout
the existing curriculum. There is ample evidence in the
literature that either approach can be made to work. There are
several dangers in the first approach. For example, it is easy to
overload the system itself. There is only time in the day for so
many subjects; introducing a new one may cost an old one.

In addition, although there is evidence for the efficiency of
teaching thinking by teaching about thinking, by making thinking
itself a subject like English or math, the risk is run that
teachers and students alike will treat thinking in the same
unproductive ways that they have sometimes treated other subjects.
The teacher will drill into the students' heads the fourteen steps
of critical thinking, and the students will dutifully list those
steps on the next quiz, without bothering ever to apply them to
any other aspect of their lives, in or out of school. Finally, by
isolating thinking skills as a separate item in the curriculum,
you make them a likely target for the first "no frills" budget
cutter who comes along.

By embedding the teaching of thinking skills in every subject,
on the other hand: 

1. You are likely to take less time away from subject-area 
studies; 

2. You give students more opportunity to apply the thinking 
skills they learn in diverse situations; 
3. You effectively forestall the possibility that thinking
skills will be deleted from the curriculum the moment they
are no longer a "hot item."

Embedding the teaching of thinking skills into every subject 
area also increases the likelihood that all students, regardless 
of achievement level, will benefit. Higher order skills should 
not be considered the special province of the gifted and talented. 

Whatever method is used to place thinking skills into the 
curriculum, it is important that we not lose sight of environment
and attitudes. No amount of explicit teaching of thinking skills
will ever overcome implicit environmental clues telling students
that independent thinking, far from being valued, is likely to get
them into trouble. A teacher or an entire school system that is
unwilling to entertain serious questions about its goals and
methods, or allow open discussion of issues of importance raised
by students--such as censorship of school book lists--is unlikely
to produce a crop of questioning students. Thinking skills must
not only be explicitly taught, they must be practiced; they must
be exemplified in the behavior of teachers and administrators; and
they must be valued in students. When these criteria are met, we
may expect to see students disposed to evaluate claims on the
basis of evidence and willing to conform judgment to principle.

Assuming then that we are agreed that higher order thinking 
skills can, at least in some degree, be taught, we come to the 
question of measurement. 




3. Can thinking skills be tested?

There is no reason why they cannot. Though there are
considerable differences between thinking skills and other skills
taught in the schools, the fact remains that thinking skills,
though perhaps not themselves observable, when exercised, produce
observable outcomes; and observable outcomes can be measured.

Admittedly, not all authorities take the same view on the
subject. In the March 1984 issue of Phi Delta Kappan, for
instance, Barry K. Beyer stated that: "The best measure of
students' ability to think may be their behavior as they sift
through data to arrive at a conclusion or as they go about solving
a problem. The development of instruments or observation
techniques that can measure such behavior ought to be a major
priority of test makers." There is no reason to believe, however,
that standard multiple-choice items cannot be constructed in such
a way that they can only be correctly answered by engaging in the
kinds of higher order thinking skills that have been discussed.
Why shouldn't students' ability to engage in those skills be
assessed with existing instruments and with instruments that can
be fairly readily produced?

Nevertheless, I think development of test instruments of the
type Dr. Beyer advocates might be a very good thing indeed. Such
instruments might well provide a more useful degree of diagnostic
information than is presently available. Having the information
that a student is deficient in, say, deduction, is of limited
value if you do not know what actual subprocesses to attack in
remedying the deficiency. I do not think it is either necessary
or advisable, however, for the education community to wait for
instruments that may or may not soon be available, given that we
are faced with a critical problem and already have at our disposal 
some useful tools with which to begin to attack that problem. 

I would suggest, for example, a norm-referenced achievement
test such as the California Achievement Tests (CAT), Forms E and
F. Even though these tests are designed primarily to measure the
most commonly taught basic skills, there are many items throughout 
the series that measure higher order thinking skills. 

These items measure more than recall of facts or answering 
questions based on the information provided. The items were 
developed to require students to analyze, to synthesize and to 
interpret the information provided. Studies will be done during 
the standardization of CAT E and F to determine what kinds of 
valid scores or results can be reported on these items. In 
addition, CAT E and F has been developed so that there is a better 
probability that reliable and valid information can be obtained 
for higher scoring students. Additional items have been included 
at the difficult end of the range to minimize the chance of 
students "topping out." CAT E and F will also provide
End-of-Course tests at the secondary level for students taking
specific courses in algebra, geometry, physics, biology, world 
history, American history, computer literacy, and consumer skills. 

While CAT E and F is still primarily designed as a measure of 
basic skills, procedures and information have been built in to 
also provide useful and valid information on higher order skills. 
In Conclusion

To close, it may be useful to very briefly reiterate several
of the points made earlier.

Is it necessary explicitly to teach higher order thinking
skills? Given the mounting evidence that our students are
deficient in such skills, I think the answer is clearly yes.

Is it, in fact, possible to teach and test such skills?
Considerable evidence suggests that it is.

Is it sufficient explicitly to teach higher order thinking
skills? Absolutely not! I think the education community needs to
take a hard look at whether or not it provides an environment in 
which thinking skills, once acquired, can flourish. Providing 
that environment is in the long run at least as important as any 
formal teaching, testing, and remediation we can provide. 







DEVELOPING MICHIGAN'S ASSESSMENT OF THINKING SKILLS: 
DEVELOPING MICHIGAN'S PROGRAM 

Edward D. Roeber and Betty L. Stevens 
Michigan Department of Education 

In July 1984, the Michigan Department of Education was
funded to investigate and plan a higher-level assessment 
program. Specifically, the Department budget bill included the 
following language: "...develop advanced skills tests for use 
in grades four, seven, and ten in the areas of language arts and 
mathematics..." Although Legislative intent was clearly to
develop more difficult assessment tests, staff of the Michigan 
Educational Assessment Program (MEAP) have also explored the 
possibility of including tests of higher order thinking skills. 
The following is a description of the current status of MEAP and
an examination of how it might be changed, what might be
changed, what might be tested in the future, and issues which
must be addressed. As with any developmental project, what
emerges in a year or two may bear little resemblance to current
plans. 

The Current Assessment Program 

The current MEAP program assesses all students in grades 
four, seven, and ten in the areas of mathematics and reading. 
This program has been in existence since 1969-70. Results of
the MEAP Program are used to help students make up skill
deficiencies, as well as to provide schools with a point of
departure in reviewing and revising their curricula in these
areas. Scores of individual students are not used in promotion
or graduation decisions. Over the years, scores on the tests
have improved considerably, most notably in the areas of
reading. 

Because results are reported in the newspapers, school
personnel, parents, and the general public are very sensitive
about information which may reflect negatively on individual
schools or local districts. This concern often stimulates
school districts to take steps to improve student performance by
making changes in school programs. Staff of the Department
(assessment, instruction, compensatory education, and so forth)
spend a considerable amount of time assisting local districts to
use the results appropriately, as well as to report them in a
useful manner. 

Forces For Change 

A major force for change of MEAP, of course, has been the
spate of reports on the condition of education nationally and in
Michigan. A number of these have proposed using testing not
only as a vehicle to monitor student achievement but also as a
stimulus for educational reform. In Michigan, for example, a
special report (Sederburg & Rudman, 1984) was prepared that
examined changes in performance for various subgroups of
students, particularly at the high school level, where
comparative data on students in Michigan and the nation is
available using college-entrance tests such as the SAT. This
report was written in response to A Nation At Risk and the
Michigan State Board of Education plan for the future (A Blueprint
for Action, 1984), which included recommendations made by the
Michigan High School Commission. The following is taken from
the summary of the Sederburg and Rudman report:

Over the past few years, state and federal educational
policy has targeted the lower achieving student. This 
targeting of funds and effort has yielded results. 
However, it is apparent that, at the same time, we may 
have neglected the better achieving student. In contrast 
to the prevailing belief, the brightest students have 
not succeeded regardless of the educational system. 

Consequently, we are calling for a shift in educational 
policy. We must create an educational system that 
challenges all young people and develops students to 
the best of their abilities. Emphasis on testing for 
basic skills for high school graduation and grade pro- 
motion reinforce the attitude that teachers and 
administrators should be most concerned with the lower 
achieving student. While it is worthwhile to insure that 
all students possess "essential" skills before graduation, 
we must not overlook the student who is not challenged 
by minimal objectives. 

The recent proposals made by the State Board of Education 
go a long way toward accomplishing the goals outlined here. 
However, the entire focus must be shifted away from minimal 
skills which tend to bring high achievers down while trying 
to bring everyone up to the highest level possible. The 
State Board and the legislature will need to clarify 
their philosophical direction as well as set specific 
goals for whatever educational reform they wish to 
achieve in the 1980's.

Proposals For Change 

The Sederburg and Rudman paper contained the first proposals 
for developing a higher-level test. Although the State Board of
Education's report included changes for the assessment program,
such changes dealt only with broadening the scope of MEAP to
include periodic, every-pupil testing of other subject areas
including Health, Science, Career Development, and Social
Studies. The Sederburg-Rudman article, however, dealt
specifically with higher-level assessment by suggesting, among
other things, that:



1. The testing program of the State Board of
Education should be changed to adequately measure
all Michigan students, not just those below the
achievement level determined by the State. 

a. The State Board should establish a 
qualified task force to develop such 
a testing program. 

b. The legislature should mandate this 
testing program through the budget of 
the Department of Education. 

2. The State Board of Education set achievement goals 
to be attained by all achievement classifications 
by a specific date. In their "Blueprint for 
Action" the State Board calls on local boards to 
initiate a 3-5 year plan to improve achievement. 
Similarly, the Board should set state goals to
improve all categories of Michigan youngsters.

3. State policy should reflect an effort to pressure 
local school districts to provide programming for 
the entire spectrum of students. The state 
testing program should be used to validate or 
accredit local school diplomas for all students. 

a. Achievement tests administered as 
early as the tenth grade should point 
to areas for potential remediation. 
The 10th grade test should emphasize 
reading, language and basic math 
skills. 

b. An 11th grade exam should include 
physical science, biological science,
and social science. The 12th grade 
year would be used to assist students 
who did not meet essential skills in 
the 10th and 11th grade exams. 

c. The State Board of Education should
use these tests as the basis for
accrediting high school diplomas.

A response to the Sederburg and Rudman paper by the
Department of Education suggested a possible direction for the
MEAP Program:



The other way in which MEAP may change in coming years is to
assess students beyond the basic skill level. This
discussion presumes that (1) testing basic skills is valid
and will still be carried out, (2) testing higher-level
skills should emphasize the same purposes as the regular
MEAP program (i.e., individual student assistance, curricula
review and revision, reporting to various audiences), (3)
students should be identified based on their basic skill
achievement, (4) such higher-level skills are either more
difficult subject matter content, critical reasoning skills
or higher-level thinking skills (e.g., analysis, synthesis
and evaluation from Bloom's Taxonomy), and (5) the students
identified can be offered a school program which meets their
educational needs, even as schools are helping students who
have not as yet achieved the minimums. The presumption is
that schools (and the State) can emphasize both "basic"
skills and "advanced" skills and not have to choose one over
the other (Roeber, 1984).



MEAP staff proposed a plan that included a two-tier
approach, with all 4th, 7th, and 10th grade students taking the
basic skill level test and those who passed taking the
higher-level examination. It was proposed that advanced tests
be developed at three levels (grades 4-6, given in seventh
grade; grades 7-9, given in 10th grade; and grades 10-12, given
in grades ten, eleven, and twelve). Staff also developed a list
of technical and policy issues for testing beyond the basic
skills.

The Department plan was presented to the State Board of
Education in early 1985. After considerable discussion, the
State Board approved the MEAP staff plan for a higher-level
assessment program and directed that a study group be convened
to examine issues and to develop a tentative assessment plan.



Developing the Plan for The Higher Level Assessment Program 

Since late 1984, Department staff have been meeting with a
planning group consisting of local and intermediate district
educators, college and university specialists and others.
Represented on the group are gifted educators, assessment and
curriculum specialists, content area specialists (e.g., science,
reading), and administrators.

The Higher Level Assessment Committee has spent a
considerable amount of time discussing methods to address
student needs, particularly those of students who already pass
the current basic skills tests. Very early in these
discussions, it was apparent that there were sharp differences
of opinion regarding the direction MEAP should take. Some
members of the advisory group, for example, proposed toughening
the current content standards tested in MEAP. Others, however,
suggested that tests of critical thinking, critical reasoning,
or thinking skills be used.

The group has been pursuing both options. Discussions have
focused on what "tougher" standards really mean, how
higher-order thinking could be tested and how this program could
mesh with the current basic skills program. Others have been
examining various approaches to teaching thinking skills,
looking particularly at how thinking skills are defined and the
implications for testing. While viewed originally as an
alternative to the current basic skill program (or, at least, a
more difficult extension of it), thinking skills is now viewed
as a logical complement to the current program, plus any new
program which might be developed.

With this background in mind, the committee began to examine
alternative approaches to the new assessment program. Members
of the committee were challenged to develop a new assessment
program model. Thus far, two plans have been suggested. The
first (Rudman, 1985) is much different than the second (Downing,
Johnson-Lewis, Leddick, Lohr, Stevens, 1985). Each is described
more fully below.

The Rudman plan proposed a different approach from that
proposed earlier by Sederburg and Rudman. The new plan is
predicated on seven assumptions:

1. The power of state-mandated assessment programs has 
been convincingly demonstrated to be a force in 
instituting instructional change within the schools. 

2. Higher order reasoning skills can best be taught as 
an integral part of some specified body of knowledge. 

3. There is a demonstrable relationship between 
focused instruction and student performance at 
all levels of ability. 

4. Schools can be effective if the mandate given them 
is strong enough and if adequate resources are 
available. 

5. There is a limit to the amount of resources--human and
fiscal--that are available for education.

6. The schools are an important instrument in affecting 
social and economic policies of a nation. 

7. Recommendations for reform of any institution,
including the schools, must be based on a reasonable
expectation of stability of public policy (pg. 25).

Rudman goes on to make four recommendations:

1. The State should develop a plan which incorporates 
a two-tier evaluation of student academic status; 

2. The State should establish a standard setting
advisory committee;

3. The State should assume a major portion of the
funding for mandated assessments;

4. The State Department of Education should establish
a technical advisory committee to determine test
specifications, set criteria for selecting tests,
and recommend tests or test contractors.

Rudman suggests that the current Michigan Educational
Assessment Program should be "mandated on a matrix-sampling
basis rather than an every-pupil requirement. ... Matrix sampling
could yield useful information for public monitoring of minimal
achievement within the state's schools while at the same time
reducing the amount and testing time." He further suggests that
this program be administered at grades 4, 7 and S, with
concentration on the first two levels of Bloom's taxonomy
(Knowledge & Comprehension).
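
As a rough illustration of what a matrix-sampling design involves, the sketch below splits an item pool into several short forms and gives each student only one form, so that every item is still answered by a subset of students. It is not taken from the Rudman plan; the function name, the pool of 60 items, the 300 students, and the choice of 4 forms are all hypothetical values chosen for the example.

```python
import random


def matrix_sample(student_ids, item_ids, n_forms, seed=0):
    """Split items into n_forms disjoint forms and give each student one form.

    Hypothetical illustration only; not the procedure of any actual program.
    """
    rng = random.Random(seed)

    # Shuffle the item pool, then deal items round-robin so that
    # form lengths differ by at most one item.
    items = list(item_ids)
    rng.shuffle(items)
    forms = [sorted(items[i::n_forms]) for i in range(n_forms)]

    # Shuffle the roster and rotate through the forms so that each
    # form is taken by a similar number of students.
    students = list(student_ids)
    rng.shuffle(students)
    assignment = {s: i % n_forms for i, s in enumerate(students)}
    return forms, assignment


# 300 students, 60 items, 4 forms: each student answers 15 items, not 60.
forms, assignment = matrix_sample(range(300), range(60), n_forms=4)
```

Under these assumed numbers each student's testing time covers a quarter of the item pool, while group-level results remain available for every item.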

Rudman also recommends that a second tier of testing should
be undertaken by the Michigan Department of Education. This
testing should be mandated on an every-pupil basis in grades 1,
3, 5, 8, 10 and 11. The content of these tests should include
much more than Reading and Mathematics. It should measure the
language skills, social sciences, science, and listening skills
at the appropriate grade levels. The content of these tests
should consist of levels 1 to 4+ of the Bloom Taxonomy
(depending upon the grade level at which the test is
administered). It should be so constructed that there are
sufficient items at a variety of difficulty levels from .30-.90.
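
The difficulty range mentioned above can be made concrete with a small calculation. The sketch below assumes that "difficulty level" means the classical item difficulty index, that is, the proportion of students answering an item correctly; the source does not define the term, and the response data here are invented for the example.

```python
def item_difficulty(responses):
    """Return the proportion correct for each item.

    responses: list of per-student lists of 0/1 item scores,
    all of the same length (hypothetical data layout).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(r[j] for r in responses) / n_students for j in range(n_items)]


def within_range(p_values, low=0.30, high=0.90):
    """Return the indexes of items whose difficulty lies in [low, high]."""
    return [j for j, p in enumerate(p_values) if low <= p <= high]


# Five invented students answering four invented items.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
]
p_values = item_difficulty(responses)   # [1.0, 0.6, 0.2, 0.8]
keep = within_range(p_values)           # items 1 and 3 fall inside .30-.90
```

In this invented data set the first item is too easy and the third too hard for the stated range, so only the second and fourth items would be retained.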

The other proposal under consideration by the Higher-Level
Assessment Committee was prepared by a subgroup of the Committee
(Downing, Johnson-Lewis, Leddick, Lohr and Stevens, 1985). This
proposal is considerably different from the Rudman plan, in that
it suggests that every pupil should be included in the same
testing program.

Five basic assumptions underlie this approach. They are: 



1. The processing of knowledge is critical to the current
information society; therefore the accumulation of
information must be accompanied by increased emphasis on
problem finding, problem solving, critical thinking and
decision making.

2. For some students, employing higher order thinking may
lead to more successful acquisition of basic skills.

3. A state mandated assessment program can and does drive
curriculum in new directions.

4. Focused instruction results in acquisition of identified 
skills. 

5. Test construction should not be attempted without 
consideration of program implementation and acceptance 
factors. Any new program must be built on what 
currently exists. 

The subgroup has recommended that the State continue a
one-tier assessment program to evaluate student academic
progress. Within this program, however, it is recommended that
the existing assessment program be expanded to include:



1. "essential skills" as these not only subsume basic
skills but can expand to include areas of greater
difficulty;

2. a writing component which focuses on higher order
thinking;

3. an indicator test to assess content of specific skills
involved in problem finding, problem solving, critical
thinking and decision making.

In order to articulate the new MEAP for educators and the
public in general, it is recommended that changes occur in a
phased approach as follows:

In Phase I, the current MEAP would be expanded to include a
measurement of thinking processes identified in accordance with
Bloom's taxonomy and essential skills in the areas of Reading,
Mathematics and Writing. Students would be tested in grades 4,
7 and 10.

Phase II would replicate Phase I and, in addition, an
indicator test would be administered to students in Grades 3, 6
and 9. The indicator would delineate the skills of
problem-finding, problem-solving, critical thinking and
decision-making.

Phase III would replicate Phase II (i.e., the indicator test
would continue to be administered) but the indicator skills
would be measured on the Grades 4, 7 and 10 essential tests.

Phase IV would be the same as the preceding Phase III. This 
final phase may include a Grade 12 test where high school 
subject content would be assessed. 

As the Committee discussed these two plans along with the
original two papers, a list of issues has emerged. These issues
are shown below, as well as the initial "votes" of the
Committee. The list of issues will form the basis of future
discussions of the Advisory Committee.



ISSUES                                        OPINIONS OF COMMITTEE

1. Should thinking be tested?                    12-Yes   0-No

2. If yes, should thinking be tested:

   a. as a separate content area?                 0-Yes   7-No

   b. within the subject matter content
      (i.e., within science, social
      studies, etc.)?                             8-Yes   0-No

   c. a combination of a & b?                     S-Yes   4-No

3. Should there be a two tier test?               7-Yes   4-No
   (If no, go to 6)

4. If Yes to 3:

   a. Should tier 1 test only essential
      subject matter and tier 2 test only
      thinking skills?                            0-Yes   5-No

   b. Should tier 1 test essential subject
      matter and thinking skills and tier 2
      test harder subject matter?                 0-Yes   S-No

   c. Should tier 1 test essential subject
      matter and thinking skills and tier 2
      test harder subject matter and
      thinking skills?                            7-Yes   0-No

5. If Yes to 3:

   a. Should all students take both tiers?        7-Yes   1-No

   b. Only those "passing" the 1st level
      take the 2nd level?                         0-Yes   5-No

   c. Test all students on level 1 and
      sample test level 2?                        0-Yes   3-No

   d. Sample test level 1 and test all
      students on level 2?                        1-Yes   2-No

6. Should there be a writing assessment?         12-Yes   0-No
   If yes, how should it fit with the above?

   Responses:

      Level 2 Test
      Compose a persuasive statement which relates
      thinking to content
      Given to all students in tier 2
      Every-pupil testing
      It could be done 1 of 2 ways:
         1. essays within content area
         2. in separate content area



Completing the Plan 

The Higher Level Advisory Committee hopes to finalize the
plan for a higher level assessment program by October, 1985.
Once completed, the plan will be submitted to the State Board of
Education for review and action. If the State Board of
Education approves the plan, staff will immediately begin to
present it to local educators throughout the State, and at the
same time will begin to develop the specific list(s) of skills
to be measured. It is anticipated that it will take at least
two years to finalize the list of skills and appropriate
measures of them, and that it will be at least three years
before a revised assessment program can be implemented.

ASSESSING HIGHER ORDER THINKING SKILLS IN CONNECTICUT:
LESSONS FROM CONNECTICUT 

Joan Boykoff Baron 
Connecticut State Department of Education 

Since 1982, the Connecticut Assessment of Educational
Progress (CAEP) program has been systematically integrating
higher order thinking skills into its assessment of subject
matter domains in grades 4, 8 and 11. To complement these
efforts, the new Connecticut Mastery Testing Program has
incorporated many inferential and evaluative comprehension
skills into its fourth grade reading test, and conceptual
understanding and problem-solving skills into its fourth grade
mathematics test (see Tirozzi et al., 1985). This paper will
first summarize what we have learned about students' thinking
skills when measured in the context of social studies and
English language arts. Then, it will summarize what we have
learned about how to measure higher order thinking skills,
discussing some of the current methods being explored and the
challenges which lie ahead.

The Performance of Connecticut Students 

In general, Connecticut students perform either the same as
or slightly better than the national sample tested by the
National Assessment of Educational Progress. Furthermore,
Connecticut students in the early and mid 1980's are performing
about the same as they were five years earlier. It is against
this backdrop of rather typical and stable performance that we
are confronting the disappointing results found on higher order
thinking skills between 1982 and 1985.

On our 1982-83 Social Studies Connecticut Assessment of
Educational Progress (CSDE, 1984) students performed poorly on
many items measuring higher order thinking skills. For example,
"students had difficulty in recognizing associations such as
cause and effect when more thought than immediate recall was
required, in drawing conclusions from evidence and in
interpreting data." Five of the nine statewide recommendations
in social studies pertained to thinking and are presented below:



o Provide students with as many opportunities as
possible to interpret information rather than merely
recite it back in an identical form.

o Encourage students to interpret information
depicted in graphs, charts, and tables rather than
simply read it.

o Emphasize the personal relevance and modern day 
implications of social studies concepts. 

o Place greater emphasis on cause and effect 
relationships. 

o Incorporate problem solving and logical 
analysis in the context of social studies. 



Similar findings from our 1983-84 English Language Arts
Connecticut Assessment of Educational Progress were reported to
the Connecticut State Board of Education:



One finding that pervaded the assessment in
reading, literature, listening, study skills,
writing, and computer literacy was that students
do well on the literal comprehension level and not
so well at the higher levels of thinking. Our
students have learned a lot. They have many facts
at their command and they can solve simple
one-step problems. However, when they are asked
to infer, integrate, and evaluate, performance
drops. Furthermore, when they are asked to solve
more complex problems involving the application of
knowledge to new situations, the condensation of
information, the synthesis of several pieces of
information or the solving of problems requiring
several steps, performance drops. In addition,
when students are asked to develop and maintain a
point of view and support it with reliable and
sufficient evidence, performance is poor.



The rest of this paper is devoted to some lessons we've 
learned on how to assess thinking skills and to a brief 
description of some of the challenges that lie ahead. 



Using Multiple Approaches to Assess Thinking Skills 

The first lesson we learned is about the importance of using
multiple approaches to assess thinking skills. Frederickson
(1984) alerts us to the bias inherent in relying solely on
multiple choice items. In our two most recent CAEP assessments
we used multiple approaches. In English language arts, for
example, we measured writing skills with five approaches which
included more than one hundred multiple-choice items, two direct
measures of writing requiring writing samples from narrative and
persuasive discourse modes, a dictation test, a note-taking
exercise, and a revising and editing test in which students had
to correct errors made by others. Furthermore, in an attempt to
be eclectic as well as thorough, we used holistic, primary
trait, and analytic scoring rubrics to score our writing
samples. Our experience clearly demonstrated that these three
scoring rubrics provide information of such varying levels of
specificity about both content and mechanics that the choice of
scoring methods should be dictated by the purpose of the test
and the degree to which the information will be used by teachers
to influence the instructional process (See Baron, 1984).

It is unfortunate but true that different measures of the
same trait using different methods often provide different
results. For example, spelling results using multiple choice
items requiring students to select the one misspelled word from
among four alternatives differ dramatically from spelling
results generated from a paragraph densely laden with an
unspecified number of errors which students have to locate and
correct. (See Baron et al., 1985 and CSDE, 1985 for some
examples.)

Fortunately, sometimes different approaches yield
corroborative results, a particularly reassuring finding when
one is preparing to embark on a major effort to remedy a
problem. One example of corroborative data was found on our
English language arts test when we used three approaches to
measuring students' ability to recognize and provide good
support in writing. Student performance was consistently
disappointing. On a persuasive essay, at all three grade levels
tested (4, 8, and 11), fewer than 5 percent of the students were
judged to have provided enough support to convince a television
critic to either write more editorials like the one he had
written or to take back what he had written. Fewer than five
percent of the grade 8 students provided support that was judged
adequately deep, sufficiently credible, or amply numerous. The
grade 11 students performed slightly better, with 17 percent
providing sufficient support to validate their position and 31
percent explaining the stated reasons with ample explanation.
On the revising test, where students were specifically asked to
provide support for purchasing school computers, the grade 8
students outperformed the grade 11 students, with just over a
quarter of the grade 8 students providing two or more credible
facts, examples and/or reasons as support. (The corresponding
number of eleventh grade students was 12 percent.) On a
multiple choice item requiring students to recognize an essay's
greatest weakness, only 40 percent of the grade 8 students
identified the correct answer, "It does not provide enough
supporting examples." (See Baron & Kallick, 1985 for some
examples and CSDE, 1985 for a more detailed description of the
findings.)

Our recently completed science assessment also used multiple
assessment approaches. We measured the same concepts using
multiple choice items, short essay questions, and a practical
test which included short tasks like focusing a microscope,
wiring an electrical circuit to light a bulb, weighing,
measuring and sorting objects, and conducting an experiment. In
examining the data, the importance of using multiple approaches
was quite evident. Consider the multiple choice item provided
in Exhibit 1. What conclusions might be drawn from the data
which shows that 71 percent of the fourth grade students
answered this item correctly, as compared with just over
half of the eighth and eleventh grade students? Our state
advisory committee generated lots of hypotheses ranging from
skeptical suggestions like "it's an anomaly" to more
optimistic ideas like "these fourth grade students must be
getting the 'hands-on experience' that many of the science
experts in the state have been advocating." Imagine the
committee's surprise when they turned to the results of the
practical exercises in which fourth grade students had been
asked to predict what would happen if a penny and a quarter were
dropped together and discovered that only 5 percent gave the
correct response that they would fall at the same rate. It then
became clear that the students did not really understand the
physics concept being measured by the multiple choice test item
in Exhibit 1.



Exhibit 1



Suppose that you want to drop a
penny and a quarter at exactly
the same time and have them hit
the floor at exactly the same
time. Which picture BEST shows
how you would hold the penny and
the quarter just before you drop
them?

[Options A, B, and C are pictures; not reproduced here.]

D. I don't know.

Percentage of Students
Selecting Each Option

            Gr. 4   Gr. 8   Gr. 11

A             17      31      30
B*            71      58      57
C              8       8       9
D              4       3       3
No resp.       1       1       1

Example of science multiple choice item measuring higher order
thinking skill.

To Infuse Thinking Skills into Subject Areas or Keep Them 
Separate? 

The second major lesson we've learned concerns the debate 
over whether to infuse thinking skills into the curriculum and 
the test or whether to teach and assess thinking skills 
separately. Perkins (1986) refers to this as the
"generality-power tradeoff." If you teach a broad skill, it
will have wide generality to many areas but not much specific
applicability to any particular subject area. On the other 
hand, if you teach a narrow skill, it will boost performance in 
that narrow area but have little applicability to other areas. 

Infusing Thinking Skills into Subject Area Curricula and Tests

If one chooses to infuse thinking skills into the assessment
of subject areas, Bloom's taxonomy (1956) can be very useful in
designing test items if the taxonomy is used systematically. We
initially used Bloom's taxonomy to help us create a test that
would be balanced across social studies disciplines. On prior
assessments, students generally performed more poorly on some
subsets of skills than others. There was always the temptation
to conclude that students knew less about those areas. However,
when experts scrutinized the various groups of items, often they
could explain the results on the basis of the cognitive skills
demanded. In order to avoid drawing inaccurate conclusions, we
assigned to each item on the test a Bloom's taxonomic level and
equally distributed the levels across the subsets of items that
would be reported. In this way, if differences among the item
groups emerged, the differences would not be a function of
different cognitive skills.
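
The balancing step described above can be sketched as a simple check on a test blueprint. This is only an illustration under stated assumptions: the subset names (history, geography, economics), the level labels, and the item list are hypothetical, and the source does not say how the distribution was actually verified.

```python
from collections import Counter


def level_profile(items):
    """Count items at each taxonomic level within each reporting subset.

    items: iterable of (subset, level) pairs (hypothetical blueprint format).
    """
    profile = {}
    for subset, level in items:
        profile.setdefault(subset, Counter())[level] += 1
    return profile


def is_balanced(items):
    """True when every reporting subset has the same mix of levels."""
    profiles = list(level_profile(items).values())
    return all(p == profiles[0] for p in profiles)


# Invented blueprint: economics has an extra knowledge item and no
# comprehension item, so its score would not be comparable with the others.
items = [
    ("history", "knowledge"), ("history", "comprehension"),
    ("history", "application"),
    ("geography", "knowledge"), ("geography", "comprehension"),
    ("geography", "application"),
    ("economics", "knowledge"), ("economics", "knowledge"),
    ("economics", "application"),
]
balanced = is_balanced(items)            # False for this blueprint
```

When the check fails, a score difference between subsets could reflect the mix of cognitive demands rather than what students know about the subject.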

One of the findings worth noting is that contrary to popular
belief, knowledge items are often the most difficult items on
the test because of their sensitivity to instruction and recall.
In order to get a knowledge item right, the student has both to
have learned the information and to be able to recall it.
Because there is no standardized statewide social studies
curriculum or list of approved textbooks used in Connecticut,
the likelihood of all students being exposed to any particular
piece of information is low. Even if they had been exposed to
the material, they would still need to recall it, often after a
period of several years. By contrast, some of the higher order
thinking skills items were developed with generally well known
information that students were required to apply. Had we not
tried to do this, if a student got an item requiring higher
order thinking skills wrong, there would be no way to know
whether it was because the student did not have the requisite
knowledge or whether the student had the knowledge but could not
use it. An example of such an item is provided in Exhibit 2.
In this item, the only knowledge required concerns the concept
that two nations would be more likely to work together when they
each had abundant resources that the other needed. What we
learned from items like this one is that these concept
application items are often not as difficult as knowledge items
assessing less commonly known information.



Exhibit 2 



Below are the names of some imaginary 
countries that are neighbors. In which 
of the following situations would the
two neighboring countries be MOST 
LIKELY to work together? 

A. Lam has coal but not enough wheat. 
Alf has coal but not enough cotton. 

B. Dara has oil but not enough food. 
Hondo has food but not enough oil. 

C. Clow has food but not enough water. 
Tarm has food but not enough wood. 

D. Kant has sugar but not enough copper. 
Nale has potatoes but not enough fish. 



Percentage of Students
Selecting Each Option
Gr. 8

5



Example of social studies multiple choice item measuring higher
order thinking skill with familiar concept.

For further clarity, we used a more systematic procedure called
"nesting". In nesting, several items were created to cover the same
topical areas, but at different levels of conceptualization. In this
way, when students perform poorly on questions requiring higher
degrees of conceptual thought, it can be determined more accurately
whether that weakness was due to a lack of factual knowledge or
whether the problem lay elsewhere. It is often the case that
students can provide factual information, but they lack the skills
necessary to successfully apply the information to problems using
those same facts. An example of nesting is found in Exhibit 3. In
the first question 59 percent of the eighth grade students indicated
that when presented with the definition of supply and demand, they
could label it. However, on the second question, only 38 percent of
the grade 8 students could apply that definition to a hypothetical
situation requiring the understanding of an inverse correlation.



Exhibit 3



The price of a product is determined by
the relationship between people's wants
and needs, and the availability of the
product. This is called

                                  Percentage of Students
                                  Selecting Each Option
                                          Gr. 8

A. supply and demand.                      59*
B. price fixing.                           19
C. black market.                           12
D. bartering.                               9

If the law of supply and demand works,
the farmer will obtain the highest price
for crops when

A. both supply and demand are great.
B. both supply and demand are low.
C. supply is great and demand is low.
D. supply is low and demand is great.

41

Example of "nesting" using two social studies items.
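
The diagnostic value of nesting comes from looking at the two items together for each student. The sketch below is a hedged illustration of that idea, not the analysis actually run on the Connecticut data: the function name is hypothetical and the ten student records are invented, since the source reports only the overall percentages for each item.

```python
from collections import Counter


def nested_breakdown(pairs):
    """Cross-classify students on a nested pair of items.

    pairs: non-empty iterable of (knowledge_correct, application_correct)
    booleans, one pair per student (hypothetical data layout).
    Returns the percentage of students in each of the four groups.
    """
    labels = {
        (True, True): "knows and applies",
        (True, False): "knows but does not apply",
        (False, True): "applies without labeling",
        (False, False): "neither",
    }
    counts = Counter(labels[(bool(k), bool(a))] for k, a in pairs)
    total = sum(counts.values())
    return {label: 100.0 * counts[label] / total for label in labels.values()}


# Ten invented students.
pairs = (
    [(True, True)] * 4
    + [(True, False)] * 2
    + [(False, True)] * 1
    + [(False, False)] * 3
)
breakdown = nested_breakdown(pairs)
```

In this invented sample, the "knows but does not apply" group is the one nesting is designed to expose: students who have the fact but cannot use it.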

Several psychologists and philosophers have discussed the
importance of integrating critical and creative thinking and
reasoning skills into subject areas. (See Glaser, 1984 and McPeck,
1981.) It therefore seems unconscionable to not devote our
psychometric energies to continuing to develop compatible assessment
strategies. (This applies not only to state tests but to local and
national tests as well.)



Teaching and Testing Thinking Skills Separately. 

As noted earlier, there are also many experts in the field of
thinking skills who advocate testing higher order thinking skills
separately. Because he has authored several tests on critical
thinking, Robert Ennis is often cited as one such expert. This is
only partly true. In July, 1985 at a presentation at the University
of Massachusetts Critical and Creative Thinking Program Summer
Lecture Series, Ennis made clear his position that critical thinking
skills should be assessed in both ways--as infused and isolated.
This is the current position of many of the experts who have been
identified with "isolating" thinking skills, and it is the position
of the Connecticut State Department of Education as well. At the
present time, we have a statewide committee overseeing the
development of a variety of approaches to assess higher order
thinking and reasoning skills in the elementary and secondary
grades. This is part of a larger effort to develop appropriate
objectives, suggested instructional activities and learning
strategies, and staff development activities from kindergarten
through high school. Two aspects of this larger program were
pilot tested in the fall of 1984 in grade 4. These included a
set of multiple choice items based upon Sternberg's triarchic
theory of intelligence and a multiple choice test of critical
thinking skills developed by Ennis (1985).

When Sternberg was asked to help develop multiple choice
test items based upon his triarchic theory of intelligence, he
selected the following 12 objectives as being appropriate for
fourth grade students using a multiple choice format (see
Sternberg and Baron, 1985):

1. Standard verbal analogies

2. Counterfactual verbal analogies 

3. Standard number series 

4. Figural classifications

5. Everyday inference 

6. Counterfactual everyday inferences 

7. Inferences about advertisements 

8. Linear syllogisms 

9. Spotting contradictions 

10. Learning from context 

11. Route planning 

12. Mathematical and logical insights 

The objectives developed by Ennis are presented below. The
preliminary results of the test as well as a description of some
protocol analyses are described in Ennis, 1985.



DEFINE AND CLARIFY 



1. Identify central issues and problems 

2. Identify conclusions 

3. Identify reasons

4. Identify appropriate questions to ask, given the
situation

5. Identify assumptions

JUDGE INFORMATION

6. Determine credibility of sources and observations

7. Determine relevance 

8. Recognize inconsistency 

INFER: SOLVE PROBLEMS AND DRAW REASONABLE CONCLUSIONS 

9. Infer and judge inductive conclusions 
10. Deduce and judge deductive validity 

11. Predict probable consequences



It might be interesting to note that Ennis' items were pilot 
tested both as reading items and as listening items where the 
students also saw the item as it was read aloud. On average, 
the students performed about 6 percentage points better when 
they heard and saw the items, although there were some items 
where there were no differences between the two presentations 
and others on which students performed better when they read the 
items without hearing them read. This motivated us to ask Ennis 
to develop a cartoon version of the test to be used for children 
in elementary school. The cartoon version of the test is 
designed to reduce the reading load and be more motivating for 
elementary school children. We are currently pilot testing the 
cartoon version on the grade £ test in Fall 1985. 

The Challenge Ahead 

It has become increasingly apparent that there is a 
larger payoff in teaching learning strategies than in teaching 
specific knowledge (Perkins, 1986). Furthermore, teachers 
should "teach for transfer," looking for applications of the 
same skills in a variety of contexts. We hope our assessment 
instrument will develop in parallel ways, with attention paid to 
measuring the same thinking skills and strategies as applied to 
different subject areas. 

In trying to develop assessment approaches we recognize the 
need to develop activities and items that have "ecological 
validity" or a high degree of verisimilitude. In other words, 
the items should be similar to those that students will have to 
face in their lives. For example, one of the desirable traits 
of good thinkers is that they persist in the face of failure. 
We are therefore looking at ways to incorporate persistence and 
sustained thought into our assessment. Certainly writing 
exercises can be ecologically valid and incorporate sustained 
thought. And certainly, our science tests incorporated these 
traits into the practical section of the test requiring students 
to design and conduct an experiment. 

As described in Baron (1986), another fruitful area for 
evaluation is in the assessment of students' dispositions as 
they relate to students' thinking. Brandt (1985), Costa 
(1984), Duckworth (1978), Ennis (1986), Feuerstein (1980), 
and Nickerson (1986) have provided lists of dispositions of 
good thinkers. Efforts are currently underway in Connecticut to 
develop an inventory of thinking skills dispositions that can be 
used by teachers and administrators to monitor students' 
attitudes and dispositions. 

In the past few years, apart from discovering the 
inadequacies in students' thinking skills, we are beginning to 
better understand the issues, problems, and needs related to 
assessing thinking. If assessing thinking skills becomes a 
national priority, we can look forward to the collective wisdom 
of psychologists, philosophers and psychometricians assisting 
educators to develop instruments that will more accurately 
determine the extent to which our students are becoming better 
thinkers. 
THE ASSESSMENT OF HIGHER-ORDER THINKING SKILLS: 
RECENT DEVELOPMENTS 

John Fremer and Mark Daniel 
The Psychological Corporation 

This paper identifies thirteen developments related to the 
assessment of higher order thinking skills (HOTS). In our 
listing and analysis we attempt to bring together testing, 
curriculum, and instructional points of view because we are 
convinced that it is the users of test results who have the 
greatest potential to help students and improve programs. The 
best a test can do is to provide information on a sample of 
student skills. It is the teacher in the classroom, the 
curriculum supervisor, the school administrator or the program 
evaluator who must apply this information in an effective way. 

It will be useful to comment briefly on terminology. We view 
"higher order thinking skills" as those skills that go beyond 
straight recall or learning of facts. They encompass a wide 
range of activities including problem identification and problem 
solving, evaluation of information and of arguments, deduction, 
inference, taking alternate points of view, creating reasonable 
arguments in support of a position, and making decisions. The 
term "critical thinking" often is used interchangeably with 
higher order thinking, but it also has a specialized meaning that 
denotes a formal approach to problem analysis and argument 
evaluation. 

Dispositions, or motivational factors, also are central to 
higher order thinking skills, because without the desire to make 
good decisions and the willingness to consider new ideas, the 
reasoning abilities listed above are unlikely to be called into 
play. A primary goal of higher order thinking instruction is to 
create a habit of reflective thinking. 

INSTRUCTION AND TESTING 

A Major Trend 

Major attention is being devoted to the development of 
curriculum materials and tests directed at higher order thinking 
skills at all educational levels. According to Edward Glaser, a 
founder of the critical-thinking movement and author of the 
Watson-Glaser Critical Thinking Appraisal, the current interest 
in critical thinking is stronger and more widespread than at any 
time in the last 35 years. 

One way of tracking an educational movement is to look at 
press coverage. U.S. News & World Report, January 14, 1985, had 
an article, "Think. Now Schools Are Teaching How." The article 
reported on an American Federation of Teachers survey indicating 
that six states out of 23 responding had passed laws mandating 
instruction in critical thinking. It described available 
programs in the schools. On February 6, 1985, the Hartford 
Courant ran a story that was headed "State Says Johnny Can Read - 
He's Ready to Reason." The Sunday, May 19, 1985, Cleveland Plain 
Dealer ran an article that had originally been written for 
Harper's magazine, "Why Johnny Can't Think." 



At the Michigan Testing conference in February 1985, state 
testing staff from Florida and Connecticut described higher order 
thinking skills projects in their states. The emphasis in 
Florida is on testing higher levels of the Bloom taxonomy within 
content areas. The state is working on developing realistic 
standards for average-ability and high-ability students. The 
State of Connecticut asked the Psychological Corporation to build 
a higher order thinking skills test for grades 4, 6, and 8 to be 
part of the statewide mastery testing program. Connecticut has 
engaged in very thoughtful and careful planning, bringing 
together ideas from many sources including Robert J. Sternberg of 
Yale and Robert Ennis of the University of Illinois. 

Textbook and test companies are also very active. The 
Metropolitan Achievement Test, Sixth Edition (MAT-6), that is 
coming out in August 1985 has a Higher Order Thinking Skills 
score. In addition, conferences on higher order thinking skills 
are springing up all over the country. The Connecticut State 
Department of Education recently ran one, and it was 
oversubscribed by 100%! The International Conference on Critical 
Thinking and Education Reform attracts a growing number of 
participants from around the country, as does the Conference on 
Thinking (held at Harvard in 1984). 

Not Just a Reaction 

Part of what is happening can be interpreted as a reaction to 
the back-to-the-basics movement, but other factors are clearly at 
work. Some of the focus on higher order skills is a direct 
reaction to the amount of attention devoted to basic skills, 
survival skills, and minimum competency testing. The concern of 
the Basic Skills/Minimum Competency movement was to bring as many 
students as possible up to a specified minimum level of 
achievement. Exactly which group was being focused on varied 
somewhat from place to place, but it usually was something like 
the bottom 25% of developed ability. However, all along people 
have been asking: "What about the average, above average, and 
gifted student?" 

On the other hand, one of the reasons we feel that the higher 
order skills movement is more than just the opposite of the basic 
skills movement is that we see evidence that curriculum 
developers and educators want to improve the reasoning skills of 
ALL students, not just the more able ones. In general, 
instructional programs are not targeted toward special groups, 
such as gifted and talented students. Instead, attention is 
being paid to cultivating student reasoning skills over the 
entire range of student ability. Programs that do focus on 
special groups may be aimed at low-ability rather than 
high-ability groups, such as Instrumental Enrichment, which is 
intended to give low-achieving students the learning skills that 
will help them perform better. 

Recent scientific work in the field of psychology has provided 
a foundation for the current HOTS movement. There has been a great 
deal of attention paid to the processes involved in 
problem-solving and learning. Better thinkers may be seen to 
differ from less effective thinkers largely in how they approach 
problems, rather than in their "mental hardware." This research 
base has naturally encouraged efforts to add training in 
cognitive processes to the school curriculum. 

Test data are another impetus to the HOTS movement. Test 
data have seen an extraordinary amount of use in virtually all 
recent analyses of education. Some reasonable interpretations of 
National Assessment of Educational Progress (NAEP) and state 
assessment data are being combined with misinterpretations of the 
Scholastic Aptitude Test score decline data. NAEP's Reading, 
Thinking, and Writing report (1979-80) points out that many 
students seem to lack the skills to evaluate the ideas that they 
take away from something they read. We believe that the NAEP 
data do indicate a need for better training of thinking 
abilities. However, people have also used the score declines in 
the College Board's Scholastic Aptitude Test and in the ACT to 
draw unwarranted inferences about thinking skills. 

Modifiable Skills 

An important characteristic of the HOTS assessment movement 
is its emphasis on modifiable skills. This is in contrast with 
conceptions of human ability as somehow being fixed (a variation 
on the "nature versus nurture" controversy). We see people 
wanting to improve the reasoning skills of students, as opposed 
to merely using tests to classify students as being at different 
levels. The HOTS movement appears to be based on the assumption 
that virtually all children can be taught certain problem-solving 
techniques, strategies, principles, dispositions, and habits of 
thought that will improve their ability to deal with problems 
they encounter as students and as members of society. 

This emphasis on teaching rather than classifying is a very 
positive development. It relates to some other current trends: 

o preparing people to do better on tests (the computer SAT 
by Harcourt Brace Jovanovich was a smash hit); and 

o the interest in diagnosis in testing (the Stanford 
Diagnostic Reading and Mathematics Tests are very 
popular). 

Defining Content 

Tests can help to define the content of a movement. We did 
an extensive review of existing tests in the course of developing 
the Metropolitan Achievement Test Sixth Edition Higher Order 
Thinking Skills score. We found that some available tests seemed 
to go beyond what was appropriate for achievement testing; that 
is, they included figural analogies or syllogistic reasoning 
materials that are not generally part of the elementary or high 
school curriculum. Other materials were exclusively taxonomies 
that had not attempted to integrate the taxonomic terms with the 
various subject matter disciplines. Still others seemed too 
inclusive, labeling as higher order thinking almost anything that 
went beyond the initial knowledge stage in any taxonomy. Existing 
tests of thinking skills are too varied to serve as a guide to 
the content of the higher order thinking skills movement. 

Perhaps the main challenge involved in developing tests of 
higher order thinking skills will be to select objectives from 
the broad domain. It is technically possible to test for a large 
variety of reasoning and problem-solving skills. However, not 
all of these skills can be covered in any single test or perhaps 
even in any test program. Some of these skills are more 
important than others, in the sense that they have wider 
applications in school and work. For example, to borrow from 
Robert Sternberg's terminology, executive-processing skills such 
as planning and strategy selection may be more important than 
individual performance-component skills. Further, objectives 
differ in how well they can be addressed in existing school 
courses. The selection of test content will be closely linked to 
experiments in teaching higher order thinking skills. 

Everyday Applications 

A good deal of work is being devoted to how thinking skills 
can be cultivated so that students can analyze television news, 
advertising, political speeches, and other everyday presentations 
of positions and arguments. Part of Edward Glaser's motivation 
in developing the Watson-Glaser Critical Thinking Appraisal in 
the early 1940's was to help students of all levels of ability to 
think more critically about important issues. This test, which 
was revised in 1964 and again in 1980, measures Inference, 
Recognition of Assumptions, Deduction, Interpretation, and 
Evaluation of Arguments. The test is sensitive to instruction in 
critical thinking, and use of the instrument is expanding 
rapidly. 

Basic Subjects 

The curriculum areas where greatest attention appears to be 
going to higher order thinking skills are reading, writing, 
mathematics, social studies and science. Higher level thinking 
is being addressed in the basic subjects, not primarily in highly 
specialized and advanced subjects. The curriculum is being 
expanded by being given depth, not by adding new subjects. 

Wide Age Range 

One area of HOTS instruction and assessment that requires 
substantial exploration and research is the proper ages and 
developmental levels for teaching various thinking skills. As 
new methods of teaching are tried in elementary schools, we will 
learn more about the capacity of children of various ages to 
handle such things as designing experiments, analyzing the 
structure of an argument, and identifying relevant and irrelevant 
information. Lipman's Philosophy for Children program has shown 
that young children can not only learn some basic philosophical 
principles but also take an active interest in discussing them. 
It would be a mistake to assume that children of certain ages are 
unable to acquire particular reasoning skills without having made 
an effort to teach those skills in an appropriate fashion. The 
door is open to experimentation on this issue. 

The extent to which thinking skills training will be 
accessible to students at all ability levels depends in large 
part on how content is defined, and on whether thinking skills 
instruction is embedded in subject areas or treated as a separate 
subject. 

Multiple Assessment Techniques 

Both objective multiple-choice tests and more open-ended 
tests are playing a role in the assessment of higher-order 
thinking skills. Multiple-choice tests are uniquely suited to 
certain assessment needs, such as monitoring the performance of 
large numbers of children, or measuring change over time. Many 
of the thinking skills are well suited to measurement by 
multiple-choice item types. However, it is also true that some 
of the more complex thinking-skill objectives can best, or only, 
be assessed by other means. When teaching a child to analyze an 
argument, there is no better way of evaluating learning than 
asking the child to analyze an argument, orally or in writing. 
Similarly, if one wants to know whether students have developed 
the habit of selecting a problem-solving strategy before trying 
to solve the problem, the best approach probably is to observe 
the process. Thus, although objective tests can provide useful 
information on thinking skills, there will also be a need for a 
considerable amount of classroom-level assessment. 

Validation 

One of the most important issues a test developer in this 
field faces is how to validate a test of higher order thinking 
skills. There is no easy answer to the problem of validating new 
thinking-skills tests, because we lack easily-available criteria 
for "good thinking." We suspect that the task calls for a 
"bootstrap" approach. On the one hand, tests of thinking skills 
are needed to determine whether instructional programs are 
effective. On the other, the increase (or lack of 
increase) in test scores following instruction in thinking skills 
indicates whether the test is a measure of the thinking abilities 
being taught. The closer the test content is to the target 
behaviors of the training, the more confidence we can have in the 
test's construct validity. With experience we will discover 
which types of tests are sensitive to certain types of training. 
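
The score-gain half of this bootstrap check can be illustrated 
with a short sketch. It is an illustration only, not a procedure 
taken from this paper: the score lists are invented, and the 
comparison group, the function names mean_gain and 
instructional_sensitivity, and the use of a simple difference in 
mean gains are all assumptions made for the example. 

```python
from statistics import mean


def mean_gain(pre, post):
    """Average post-minus-pre score change for one group of students."""
    if len(pre) != len(post):
        raise ValueError("pre and post must list the same students")
    return mean(after - before for before, after in zip(pre, post))


def instructional_sensitivity(trained_pre, trained_post,
                              comparison_pre, comparison_post):
    """Gain of the instructed group beyond the gain of a comparison group.

    A clearly positive value suggests the test responds to the training;
    a value near zero suggests it may be measuring something else.
    """
    return (mean_gain(trained_pre, trained_post)
            - mean_gain(comparison_pre, comparison_post))


# Hypothetical number-correct scores, invented purely for illustration.
trained_pre = [12, 15, 9, 14]
trained_post = [16, 18, 13, 17]
comparison_pre = [11, 14, 10, 15]
comparison_post = [12, 14, 11, 16]

sensitivity = instructional_sensitivity(
    trained_pre, trained_post, comparison_pre, comparison_post
)
print(sensitivity)
```

In practice a developer would want larger samples and a 
significance test before drawing conclusions; the sketch shows 
only the direction of the reasoning. 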

Separate or Integrated? 

There are clear differences of opinion between those who favor 
separate instructional units on reasoning skills and those who 
insist that such skills need to be addressed within existing 
curriculum areas. Schools have a number of "free-standing" 
instructional packages to choose from, such as Philosophy for 
Children or Instrumental Enrichment. A 1984 Educational Testing 
Service (ETS) report, Focus 15: Critical Thinking, describes a 
number of programs in schools and colleges around the country. 
One of the people quoted in this report is Vincent Ruggiero, a 
textbook author. 



Ruggiero argues, "You have to have a special course for 
students to learn the full range of critical thinking, and other 
courses should reinforce what is learned." He compares the 
critical thinking course to freshman English as a course teaching 
a fundamental skill necessary to succeed in all college courses. 
He also insists that, like writing, critical thinking should be a 
part of every other course: "No one argues that because freshman 
English is taught in college, no one else has to teach writing." 

Ruggiero's course would cover problem solving and decision 
making, principles and techniques of creative thinking, 
overcoming attitudes that handicap thinking, and developing 
techniques for critiquing one's own arguments. The course would 
also introduce students to the techniques and principles of 
persuasive writing and provide them with practice in the detailed 
expression of their ideas. 

On the same page, the argument is presented that critical 
thinking should be integrated into every subject in the 
curriculum and that establishing a separate course is unnecessary 
and, in many cases, impractical. The Critical and Creative 
Thinking Program at the University of Massachusetts at Boston, 
for instance, prepares teachers to incorporate critical thinking 
into established courses. "I don't think you need to introduce a 
new course to teach critical thinking," says Robert Swartz. 
"Perhaps the best approach is to introduce critical thinking into 
the existing curriculum - to make it part of existing courses. 
Certainly, to introduce critical thinking as a separate course 
without making it part of the rest of the curriculum sends a 
mixed message to students." 

In support of this position the ETS report quotes Barry K. 
Beyer, in an April 1984 Phi Delta Kappan article, as saying, 
"Research suggests that skills taught in isolation from subject 
matter are not likely to transfer easily to other situations 
where they can be used productively. Research also suggests that 
skills taught in isolation from one another are not likely to 
become functional. Furthermore, research suggests that massed 
practice of skills is not as effective in promoting learning as 
intermittent practice and reinforcement over a long period of 
time. Thus the research that has been conducted seems to argue 
for sequential instruction in thinking skills across all subject 
areas and throughout all grades, K-12. Few such curricula exist, 
but they should be developed." 

The ETS report goes on to describe an "integrative approach" 
that is being pioneered in the junior high schools of 
Pennsylvania's Neshaminy School District. Each of the district's 
three junior high schools employs a specialist who comes into 
regular classrooms to present units in critical thinking and 
philosophy that are coordinated with the subject matter of the 
standard curriculum. 

Textbook publishers are working hard to emphasize the units 
on higher order skills in their existing materials and to produce 
new materials such as Thinking Boxes and Packages. 

Choosing a Program 

If you have to choose a program, the approach recommended by 
Dr. Robert Sternberg may be helpful. He argues that the research 
that he and a colleague, Janet Davidson, have carried out 
supports the effectiveness of well-executed, theory-based training 
in higher order intellectual skills. He presents thirteen 
general principles for selecting and offering training programs. 

1. Clarify your purposes and needs for training. 

2. Choose programs with some real-world content, not all 
abstract materials. 

3. Choose programs that are motivating to teachers. 

4. Teach for transfer. 

5. Have an instructional theory. 

6. Address broad-ranging intellectual skills, not narrow 
test-item content. 

7. Teach children how to learn, so they can keep on growing. 

8. Use multiple teaching approaches. 

9. Provide an integrated program. 

10. Use socioculturally appropriate materials. 

11. Be responsive to individual differences. 

12. Find children's strengths and capitalize on them. Help 
children recognize and deal with their weaknesses. 

13. Eliminate barriers to using intellectual skills. 

Testing Teachers 

The issue of upgrading teachers' as well as students' thinking 
skills is receiving attention in instruction and assessment. One 
of the places we sought help in preparing this paper was ETS. 
ETS is now receiving requests to help upgrade the thinking skills 
of teachers. Some propose that there be course work for teachers 
in reasoning skills, followed by certification testing in 
reasoning. 

Breadth of Movement 

The number of different currents of thought and research that 
are being brought together under the higher order thinking skills 
banner is quite remarkable: 

o philosophers (formal and informal logic, Philosophy for 
Children, dialectical thinking) 

o state assessment staff 

o curriculum designers 

o cognitive psychologists 

o test developers 

o veterans in the area and newcomers 

o people working at all levels of education 

Cognitive psychology, in particular, has had some influence, but 
we think it has potential for a great deal more. Often it takes 
a long time for the findings of cognitive psychological research 
to be applied to educational practice. We have already alluded to 
several questions concerning higher order thinking instruction 
and assessment that need to be addressed by research. If we 
work harder in testing and instruction to involve the research 
community in our development activities, we see substantial 
payoffs being possible. 